Green Codes: Energy-Efficient Short-Range Communication



Pulkit Grover and Anant Sahai 
Wireless Foundations, Department of EECS 
University of California at Berkeley, CA-94720, USA 
{pulkit, sahai} @eecs.berkeley.edu 






Abstract — A green code attempts to minimize the total en- 
ergy per-bit required to communicate across a noisy channel. 
The classical information-theoretic approach neglects the energy 
expended in processing the data at the encoder and the decoder 
and only minimizes the energy required for transmissions. Since 
there is no cost associated with using more degrees of freedom, 
the traditionally optimal strategy is to communicate at rate zero. 

In this work, we use our recently proposed model for the 
power consumed by iterative message passing. Using generalized 
sphere-packing bounds on the decoding power, we find lower 
bounds on the total energy consumed in the transmissions and 
the decoding, allowing for freedom in the choice of the rate. We 
show that contrary to the classical intuition, the rate for green 
codes is bounded away from zero for any given error probability. 
In fact, as the desired bit-error probability goes to zero, the 
optimizing rate for our bounds converges to 1. 

I. Introduction 

With the development of billion-transistor chips, the range
of communication has come down dramatically from hundreds
of kilometers (e.g., deep space communication) to a few meters
(e.g., ad-hoc wireless networks) or a few millimeters or even
less (e.g., on-chip communication). To communicate over
smaller distances, the transmit power required is much smaller.
At these distances, the energy used in transmissions can be 
comparable to that expended by the system processes. The 
small size limits the ability of these chips to dissipate heat. 
Further, the chip might be battery operated, imposing stringent 
constraints on its energy usage. It is therefore of interest 
to design coding techniques that minimize the total energy 
consumed, which includes the transmission energy as well as 
the processing energy. We refer to the coding techniques that 
minimize the total energy as green codes. 

The classical information theoretic approach finds the mini- 
mum transmission energy required to communicate reliably 
across the channel. The approach is motivated by long-range
communication, which corresponds to power-constrained
channels. Shannon [1] first characterized the minimum energy
required to communicate across a channel with fixed rate. The 
resulting bounds are expressed using 'waterfall' curves that 
convey the revolutionary idea that unboundedly low proba- 
bilities of bit-error are attainable using only finite transmit 
power. This characterization raises a natural question: what 
is the minimum energy required for communication that is 
free of a rate constraint? The classical approach [2] [3] gives 
the minimum transmission energy required (on average) to 



communicate one bit reliably across the channel. For example, 
for an AWGN channel of noise variance 1, this minimum 
energy is 

$$E_{\text{per bit}} = 2\ln(2) \text{ Joules}. \tag{1}$$

Since there is no penalty associated with lower rates, it is good
to use as many degrees of freedom as are available, and the 
optimal transmission rate is zero. 
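The limit in (1) can be checked numerically. The sketch below assumes a real AWGN channel with noise variance 1 and capacity $\frac{1}{2}\log_2(1 + P)$ bits per channel use, so that operating exactly at capacity at rate $R$ costs $P/R = (2^{2R} - 1)/R$ joules per bit. The function name `min_energy_per_bit` is illustrative and does not appear in the original work.

```python
import math

def min_energy_per_bit(rate):
    """Transmit energy per bit when operating at capacity at `rate`
    bits per channel use over an AWGN channel of noise variance 1."""
    # Solve 0.5 * log2(1 + P) = rate for P; expm1 keeps precision at small rates.
    power = math.expm1(2.0 * rate * math.log(2.0))
    return power / rate

# The energy per bit decreases toward 2*ln(2) as the rate goes to zero.
for rate in (1.0, 0.1, 0.001):
    print(rate, min_energy_per_bit(rate))
print("limit:", 2.0 * math.log(2.0))
```

Under this transmit-only accounting the energy per bit decreases monotonically as the rate shrinks, which is why the classical optimum sits at rate zero.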

The problem of minimizing combined transmission and 
processing energy is well studied in networks. The common 
thread in [4], [5], [6], [7], [8], [9] is that the energy consumed 
in processing the signals can be a substantial fraction of the 
total power. In [7], an information-theoretic formulation is
considered. The authors model the processing energy by a
constant $\epsilon$ per unit time when the transmitter is transmitting
(and hence, is in the 'on' state). A total of $\tau$ channel uses
are allowed, and the total energy available is $\tau\xi$, where $\xi$ is a
constant. Let $P_i$ be the transmit power at the $i$-th time instant, and
let $C(P_i)$ be the capacity of the corresponding channel. Then
the problem is to transmit the maximum number of bits with the
total energy at most $\tau\xi$. That is,

$$\max \sum_{i=1}^{\tau} C(P_i) \tag{2}$$

$$\text{subject to } \sum_{i=1}^{\tau} 1_i (P_i + \epsilon) \le \tau\xi \tag{3}$$

where $1_i = 1$ if a symbol is transmitted in the $i$-th channel use,
and is 0 otherwise. This is equivalent to dividing the channel
into $\tau$ sub-channels, with independent coding on each sub-channel.
Since the capacity function $C(P)$ is concave in its
argument, for maximizing the total number of information bits
communicated, the transmission energy $P_i$ should be equal
for all $i$ where $1_i = 1$. Without accounting for the energy
consumed by the system processes, the optimal strategy would
be to use all the $\tau$ parallel channels, and share the energy
equally amongst them. However, the energy consumed by the
system processes imposes a fixed penalty on each channel use.
The authors quantify this tension by measuring the 'burstiness' $\theta$
of signaling, defined as $\theta = \frac{1}{\tau}\sum_{i=1}^{\tau} 1_i$.

The transmissions should not be too bursty because of the
law of diminishing returns associated with the $\log(\cdot)$ function.
On the other hand, the transmission strategy should not make
use of all degrees of freedom either, since there is an $\epsilon$ cost
associated with the use of each degree of freedom. The authors
conclude that for minimum total energy, $0 < \theta < 1$. Contrary
to conventional information theoretic wisdom, it is no longer
optimal to use all available degrees of freedom. Consequently,
the optimal rate that minimizes the total energy consumption
is bounded away from zero. That is, when processing energy is
taken into account, green codes may not communicate at zero
rate!
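The tension between burstiness and processing cost can be illustrated with a small grid search. This is only a sketch of the black-box model described above, not the analysis of [7]: it assumes an AWGN channel of noise variance 1, equal transmit power on every 'on' channel use, and an energy constraint met with equality, so that a fraction $\theta$ of 'on' uses leaves power $\xi/\theta - \epsilon$ per use. The function names and the numerical values of $\xi$ and $\epsilon$ are hypothetical.

```python
import math

def bits_per_channel_use(theta, xi, eps):
    """Bits delivered per available channel use when a fraction `theta`
    of the uses is 'on' and each 'on' use pays eps for processing."""
    power = xi / theta - eps          # equal transmit power on each 'on' use
    if power <= 0.0:
        return 0.0                    # the budget does not even cover processing
    return theta * 0.5 * math.log2(1.0 + power)   # noise variance 1 assumed

def best_burstiness(xi, eps, grid=1000):
    """Grid search for the fraction of 'on' uses that maximizes throughput."""
    thetas = [i / grid for i in range(1, grid + 1)]
    return max(thetas, key=lambda t: bits_per_channel_use(t, xi, eps))

print(best_burstiness(xi=1.0, eps=0.0))   # no processing cost
print(best_burstiness(xi=1.0, eps=0.5))   # processing cost per 'on' use
```

With no processing cost the search selects every channel use, while a positive cost pushes the optimum into the interior for these illustrative values. For a large energy budget the optimum can still sit at $\theta = 1$ in this simplified sketch.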

The objective in [7], [5], [9] is to reduce the energy consumption
for wireless devices that consume energy continuously
when operating, e.g., hand-held computers, high-end laptops,
etc. Energy consumption per unit time for such devices is
indeed well modeled by a constant, possibly independent
of the coding strategy being used. In this paper, we are
interested in the energy expended by the decoding process 
itself. The decoding circuit requires some non-zero energy to 
perform each operation. As opposed to energy consumed by 
system processes in [7], [5], [9], the decoding energy depends 
significantly on the code construction, the rate and the desired 
error probability, and therefore needs more careful modeling. 

In this work, we study explicit models of energy expended at 
the decoder. Owing to their low implementation complexity, 
and hence low energy consumption, we concentrate on the 
message passing decoder. For this decoder, we derive lower 
bounds on the combined transmission and decoding energy, 
with no constraint on the rate. We show that the optimizing 
rate for green codes based on message passing decoding is 
indeed bounded away from zero. As the error probability 
decreases to zero, the optimizing rate increases. In a result that 
is qualitatively different from those in [7], we show that there 
is no advantage in increasing the rate beyond 1. Therefore, 
as the error probability converges to zero, the optimizing rate 
converges to 1!

The organization of the paper is as follows: In Section II
we introduce the channel model, the decoder model, and
the energy model. In Section III we summarize some of
our results in [10]. In Section IV we build on the results
in [10] to find bounds on the minimum total energy required to
communicate across a channel, with no rate constraint, taking
into account the decoding energy as well. We conclude in
Section V.

II. System model 

Consider a point-to-point communication link. An information
sequence $B_1^k$ is encoded into one of $2^{mR}$ codewords $X_1^m$, using a
possibly randomized encoder. The observed channel output is
$Y_1^m$. The information sequences are assumed to consist of iid
fair coin tosses and hence the rate of the code is $R = k/m$.

The channel model considered is an average power constrained
AWGN channel of noise variance $\sigma_P^2$. We also obtain
some results for the BSC arising from performing hard-decision
on BPSK symbols transmitted over an AWGN channel.
The true channel is denoted by $P$. The channel capacity is
denoted by $C_{\sigma^2}(P_T)$, where $\sigma^2$ is the noise variance, and $P_T$
is the average power constraint. We drop $\sigma^2$ from this notation
when no ambiguity is created in doing so.



For maximum generality, we do not impose any a priori 
structure on the code itself. Instead, inspired by [11], [12], 
[13], we focus on the parallelism of the decoder and the energy 
consumed within it. We assume that the decoder is physically 
made of computational nodes that pass messages to each other 
in parallel along physical (and hence unchanging) wires. A 
subset of nodes are designated 'message nodes' in that each 
is responsible for decoding the value of a particular message 
bit. Another subset of nodes (not necessarily disjoint), called 
the 'observation nodes' has members that are each initialized 
with at most one observation of the received channel output 
symbols. There may be additional computational nodes to 
merely help in decoding. 

The implementation technology is assumed to dictate that
each computational node is connected to at most $\alpha + 1 > 2$
other nodes¹ with bidirectional wires. No other restriction is
assumed on the topology of the decoder. In each iteration, each
node sends (possibly different) messages to all its neighboring
nodes. No restriction is placed on the size or content of
these messages except for the fact that they must depend
only on the information that has reached the computational
node in previous iterations. If a node wants to communicate
with a more distant node, it has to have its message relayed
through other nodes. The neighborhood size at the end of $l$
iterations is denoted by $n \le \alpha^{l+1}$. Each computational node
is assumed to consume a fixed $E_{node}$ joules of energy at each
iteration.
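The relation $n \le \alpha^{l+1}$ can be inverted to give the smallest number of iterations that could possibly reach a neighborhood of size $n$. The sketch below does this with integer arithmetic; the function names are illustrative, and the per-node energy is simply $E_{node}$ times that iteration count.

```python
def min_iterations(n, alpha):
    """Smallest integer l such that alpha**(l + 1) >= n."""
    if alpha < 2 or n < 1:
        raise ValueError("need alpha >= 2 and n >= 1")
    iterations = 0
    reach = alpha                 # largest neighborhood possible after 0 iterations
    while reach < n:
        reach *= alpha            # each iteration multiplies the bound by alpha
        iterations += 1
    return iterations

def min_energy_per_node(n, alpha, e_node):
    """Lower bound on the energy one node spends reaching neighborhood size n."""
    return e_node * min_iterations(n, alpha)

print(min_iterations(1000, 4), min_energy_per_node(1000, 4, 1e-12))
```

Because the reachable neighborhood grows geometrically in $l$, the iteration count, and hence the per-node energy, grows only logarithmically in the required neighborhood size.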

Let the average probability of bit error of a code be denoted
by $\langle P_e \rangle$ when it is used over channel $P$. The main tool is
a lower bound on the neighborhood size $n$ as a function of
$\langle P_e \rangle$ and $R$. This then translates into a lower bound on the
number of iterations that can in turn be used to lower bound
the required decoding power.

Throughout this paper, we allow the encoding and decoding 
to be randomized with all computational nodes allowed to 
share a pool of common randomness. We use the term 'average 
probability of error' to refer to the probability of bit error 
averaged over the channel realizations, the messages, the 
encoding, and the decoding. 

III. Lower bounds on the decoding complexity and total energy

In this section we summarize our results for lower bounds
on decoding complexity for an AWGN channel from [10].
The main bounds are given by theorems that capture a local
sphere-packing effect. These can be turned around to give
a family of lower bounds on the neighborhood size $n$ as a
function of $\langle P_e \rangle$ and $R$. Using a simple lower bound on the
number of iterations, $l \ge \frac{\log(n)}{\log(\alpha)} - 1$, we get a lower bound²
on complexity. The family of lower bounds is indexed by the
choice of a hypothetical channel $G$ and the bounds can be
optimized numerically for any desired set of parameters.

¹ In practice, this limit could come from the number of metal layers on a
chip. $\alpha = 1$ would just correspond to a big ring of nodes and is therefore
uninteresting.

² We approximate this by $l \ge \frac{\log(n)}{\log(\alpha)}$ for the rest of the paper.



Theorem 3.1: For the AWGN channel and the decoder
model in Section II, let $n$ be the maximum size of the decoding
neighborhood of any individual message bit. The following
lower bound holds on the average probability of bit error.

where $C_{\sigma_G^2}(P_T) = \frac{1}{2}\log_2\left(1 + \frac{P_T}{\sigma_G^2}\right)$, and the KL divergence $D(\sigma_G^2 \| \sigma_P^2) = \frac{1}{2}\left[\frac{\sigma_G^2}{\sigma_P^2} - 1 - \ln\left(\frac{\sigma_G^2}{\sigma_P^2}\right)\right]$.

Proof: See [10]. There is a better bound in [10] as well.
This bound is presented here for ease of exposition. ■
Observe that the required value of $n$ increases as $\langle P_e \rangle$ decreases.
Taking log on both sides of (4), it is evident that for
small $\langle P_e \rangle$, the term $nD(\sigma_G^2 \| \sigma_P^2)$ dominates the other terms
in the RHS. For small $\langle P_e \rangle$, $\sigma_G^2$ can be taken close to $\sigma_G^{*2}$ that
satisfies $C_{\sigma_G^{*2}}(P_T) = R$. Neglecting the other two terms, we
get

$$n \gtrsim \frac{\log_2\left(1/\langle P_e \rangle\right)}{D(\sigma_G^{*2} \| \sigma_P^2)}. \tag{5}$$
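The small-error approximation of the neighborhood size can be evaluated directly. The sketch below assumes the standard KL divergence between two zero-mean Gaussians (in nats) and takes the hypothetical noise variance to be the one whose AWGN capacity at power $P_T$ equals $R$, that is $P_T/(2^{2R} - 1)$. The ratio in the approximation does not depend on the logarithm base as long as the numerator and the divergence use the same base, so natural logarithms are used throughout. All function names are illustrative.

```python
import math

def gaussian_kl(var_g, var_p):
    """KL divergence (nats) between zero-mean Gaussians N(0, var_g) and N(0, var_p)."""
    ratio = var_g / var_p
    return 0.5 * (ratio - 1.0 - math.log(ratio))

def matched_noise_variance(p_t, rate):
    """Noise variance at which the AWGN capacity at power p_t equals `rate`."""
    return p_t / (2.0 ** (2.0 * rate) - 1.0)

def approx_neighborhood_size(pe, p_t, rate, var_p):
    """Approximate lower bound on n for small bit-error probability pe."""
    var_g = matched_noise_variance(p_t, rate)
    if var_g <= var_p:
        raise ValueError("rate must be below the capacity of the true channel")
    return math.log(1.0 / pe) / gaussian_kl(var_g, var_p)

for pe in (1e-3, 1e-6, 1e-12):
    print(pe, approx_neighborhood_size(pe, p_t=30.0, rate=1.0, var_p=1.0))
```

The required neighborhood grows linearly in $\log(1/\langle P_e \rangle)$ and blows up as the rate approaches the capacity of the true channel, since the divergence then goes to zero.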



IV. Minimization of total energy by optimizing over the rate and transmit power

Consider the total energy spent in transmission. For transmitting
$k$ bits at rate $R$, the number of channel uses is
$m = k/R$. If each transmission has power $\xi_T P_T$, the total
energy used in the transmissions is $\xi_T P_T m$.

At the decoder, let the number of iterations be $l$. Assume that
each node consumes $E_{node}$ joules of energy in each iteration.
The number of computational nodes can be lower bounded
by $m$, the number of received channel outputs, and also by
$k$, the number of bits to be decoded. We lower bound by the
maximum of the two³:

$$E_{dec} \ge E_{node} \times \max\{k, m\} \times l. \tag{6}$$



There is no lower bound on the encoding complexity and so
the encoding is considered free. For $m$ transmissions with
average power $P_T$, we require $mP_T$ joules of energy. This
results in the following bound for the weighted total energy⁴:

$$E_{total} \ge \xi_T m P_T + \xi_D E_{node} \max\{k, m\} \times l. \tag{7}$$
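The bound in (7) is simple enough to evaluate by hand, but a short function makes the role of $\max\{k, m\}$ explicit. This is a sketch with an illustrative function name and made-up unitless numbers; the weights default to 1.

```python
def total_energy_bound(k, rate, p_t, iterations, e_node, xi_t=1.0, xi_d=1.0):
    """Weighted lower bound on transmit plus decoding energy for k bits."""
    m = k / rate                                            # number of channel uses
    transmit = xi_t * m * p_t                               # weighted transmit energy
    decode = xi_d * e_node * max(k, m) * iterations         # at least max(k, m) nodes
    return transmit + decode

# Rate below 1: observation nodes dominate the node count (m > k).
print(total_energy_bound(k=1000, rate=0.5, p_t=2.0, iterations=3, e_node=1.0))
# Rate above 1: message nodes dominate, so the decoding term stops shrinking.
print(total_energy_bound(k=1000, rate=2.0, p_t=2.0, iterations=3, e_node=1.0))
```

For fixed transmit power and iteration count, raising the rate above 1 reduces only the transmit term, because the decoder still needs at least one node per message bit.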



³ A lower bound of $m + k$ would not allow for node sharing between the
set of observation nodes and the message nodes.

⁴ The parameters $\xi_T$ and $\xi_D$ are weights assigned to the transmit and the
decoding energy respectively. $\xi_T$ depends on the path-loss across the channel.
$\xi_D$ indicates the relative importance of decoding energy. For example, if the
energy use at the decoder is severely constrained, $\xi_D$ would be large.



Using $l \ge \frac{\log_2(n)}{\log_2(\alpha)}$,

$$E_{total} \ge \xi_T m P_T + \xi_D E_{node} \max\{k, m\} \frac{\log_2(n)}{\log_2(\alpha)} \propto \frac{m P_T}{\sigma_P^2} + \gamma \max\{k, m\} \log_2(n), \tag{8}$$

where $\gamma = \frac{\xi_D E_{node}}{\sigma_P^2 \xi_T \log_2(\alpha)}$ is a constant that summarizes all
the technological and environmental terms. The expression
in (8) gives the normalized total energy, normalized by the
noise variance $\sigma_P^2$. Figure 1 provides example⁵ behavior of $\gamma$
with distance. The neighborhood size $n$ itself can be lower
bounded by plugging the desired average probability of error
into Theorem 3.1.

Fig. 1. The plot shows the behavior of $\gamma$ with distance $d$ for a distance-dependent path loss for $d > 0.1$ mm (and path loss 1 for smaller $d$). $E_{node}$ is 1 pJ, $\alpha = 4$, $\xi_D = 1$, and $\sigma_P^2 = 4 \times 10^{-21}$ J.
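The constant $\gamma$ can be computed from the quoted parameter values. The sketch below takes the thermal noise energy per sample as $kT$ at an assumed temperature of 300 K and uses a base-2 logarithm of $\alpha$. The path-loss weights passed for $\xi_T$ are hypothetical inputs chosen for illustration, not values taken from the figure.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K (exact SI value)

def thermal_noise_energy(temperature_kelvin=300.0):
    """Thermal noise energy per sample, kT, in joules."""
    return BOLTZMANN * temperature_kelvin

def gamma(e_node, alpha, xi_d, xi_t, noise_energy):
    """Ratio of weighted per-node decoding energy to weighted noise energy."""
    return xi_d * e_node / (noise_energy * xi_t * math.log2(alpha))

print(thermal_noise_energy())                      # close to 4e-21 J
print(gamma(1e-12, 4, 1.0, 1.0, 4e-21))            # path loss 1 (very short range)
print(gamma(1e-12, 4, 1.0, 6.25e8, 4e-21))         # hypothetical larger path loss
```

With a path loss of 1, the per-node decoding energy dwarfs the noise energy, so $\gamma$ is very large. It falls in inverse proportion to the path-loss weight, which is how decoding cost comes to matter less at longer range.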



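The energy per bit is obtained by dividing (8) by the number of
information bits $k$, using $m/k = 1/R$. We thus obtain the following
expression for the minimum normalized total energy,

$$E_{\text{per bit}} \ge \min_{P_T, R} \left[ \frac{P_T}{R\sigma_P^2} + \gamma \max\left\{\frac{1}{R}, 1\right\} \log_2(n) \right]. \tag{9}$$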
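A rough numerical version of this minimization can be sketched with a grid search. It does not reproduce the paper's numerical evaluation: it replaces Theorem 3.1 by the small-error approximation $n \approx \log(1/\langle P_e \rangle)/D$, clamps $n$ at 1 so that the decoding term is never negative, and searches a hypothetical grid of rates and normalized transmit powers $P_T/\sigma_P^2$. All names and grid choices are assumptions made for illustration.

```python
import math

def neighborhood_size(pe, snr, rate):
    """Approximate lower bound on n; infinite if the rate is not below capacity."""
    rho = snr / (2.0 ** (2.0 * rate) - 1.0)    # matched noise variance / true variance
    if rho <= 1.0:
        return math.inf
    kl = 0.5 * (rho - 1.0 - math.log(rho))     # Gaussian KL divergence in nats
    return max(1.0, math.log(1.0 / pe) / kl)   # clamp: a neighborhood has >= 1 node

def energy_per_bit(pe, snr, rate, gamma):
    """Normalized transmit plus decoding energy per bit at one (snr, rate) point."""
    n = neighborhood_size(pe, snr, rate)
    if math.isinf(n):
        return math.inf
    return snr / rate + gamma * max(1.0 / rate, 1.0) * math.log2(n)

def minimize_energy(pe, gamma, rates, snrs):
    """Grid search; returns (energy, rate, snr) at the best grid point."""
    best = (math.inf, None, None)
    for rate in rates:
        for snr in snrs:
            energy = energy_per_bit(pe, snr, rate, gamma)
            if energy < best[0]:
                best = (energy, rate, snr)
    return best

RATES = [0.05 * i for i in range(1, 61)]       # 0.05 .. 3.0 bits per channel use
SNRS = [0.1 * 1.1 ** j for j in range(120)]    # geometric grid of P_T / sigma_P^2

for pe in (1e-3, 1e-8, 1e-13):
    print(pe, minimize_energy(pe, 0.2, RATES, SNRS))
```

Since the approximation is only meant for small error probabilities and the grid is coarse, the printed optima indicate trends rather than the values plotted in the figures.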

Observe that in (9), the decoding energy increases as the error
probability decreases for constant transmit power and rate.
This behavior is not reflected by using the model inspired
from [7] for decoding energy. The bounds in [7] are for error
probability converging to zero. To compare our bounds with
the black-box model of [7], in Appendix I we derive bounds
for non-zero error probability based on the model in [7].
We plot the two bounds against each other in Figure 2 for
$k = 10{,}000$ bits.

We choose $\epsilon = 4$, for which the total energy per bit for
the black-box model equals the energy per bit for $\gamma = 0.2$
for our bound at $\langle P_e \rangle = 10^{-13}$. The figure shows that for
$\langle P_e \rangle$ smaller than this threshold, the model inspired from [7]
underestimates the total energy. This is because this model treats
the decoder as a black box where $\epsilon$ does not change with error
probability or rate.

⁵ The energy cost of one iteration at one node, $E_{node} \approx 1$ pJ, is arrived at
by an optimistic extrapolation from the reported values in [14], [15]; the thermal
noise energy per sample, $\sigma_P^2 \approx 4 \times 10^{-21}$ J, follows from $kT$ with $T$ around room
temperature.

It is interesting to observe what values of $R$ optimize (9).
Under the small $\langle P_e \rangle$ approximation in (5), we now heuristically
argue that the optimal rate $R_{opt}$ should converge to 1 as
$\langle P_e \rangle \to 0$.

Observe that for $R < 1$,

$$E_{\text{per bit}} \ge \frac{P_T}{\sigma_P^2 R} + \frac{\gamma}{R}\log_2(n) = \frac{1}{R}\left(\frac{P_T}{\sigma_P^2} + \gamma \log_2(n)\right).$$

As $\langle P_e \rangle \to 0$, $n \to \infty$. Therefore, the decoding energy
increases to infinity. Increasing the rate $R$ at the cost of
increasing $P_T$ offsets the increasing decoding costs. However,
for $R \ge 1$,

$$E_{\text{per bit}} \gtrsim \frac{P_T}{\sigma_P^2 R} + \gamma \log_2\left(\frac{\log_2\left(1/\langle P_e \rangle\right)}{D(\sigma_G^{*2} \| \sigma_P^2)}\right), \tag{10}$$

which indicates there is no advantage in increasing rate beyond
$R = 1$, since it no longer decreases the decoding energy.

Evidently, for finite $\langle P_e \rangle$, there exists an optimal rate
$R_{opt} > 0$ that minimizes the combined energy consumed.
Using numerical evaluation of the bound (9), we plot the
behavior of the optimal rate with $\langle P_e \rangle$ in Figure 5. The plots
demonstrate that the optimal rate indeed converges to 1.

Figure 3 shows the behavior of our lower bound on sum
energy with $\langle P_e \rangle$ for various values of $\gamma$. Figure 4 shows that
similar behavior also holds for a BSC arising from performing
hard-decision on BPSK symbols transmitted over an AWGN
channel. The optimal rate for this channel also converges to 1
as $\langle P_e \rangle \to 0$. Due to lack of space, we omit the plots.




Fig. 2. The plot shows the comparison of lower bounds on the minimum
normalized energy for $k = 10{,}000$ bits. The 'black-box bounds' plot is based
on the model in [7], where the details of the processor are ignored. Our bounds
take into account the decoder structure as well.

Fig. 3. The plot shows the behavior of the lower bound on the normalized sum
energy with $\langle P_e \rangle$ for various values of $\gamma$. The sum energy goes to infinity as
$\langle P_e \rangle \to 0$.

Fig. 4. The plot shows the behavior of the lower bound on normalized sum
energy with $\langle P_e \rangle$ for various values of $\gamma$ for a BSC arising from performing
hard-decision on BPSK symbols transmitted over an AWGN channel. The
optimizing rate converges to 1 as $\langle P_e \rangle \to 0$. Even so, this plot shows that
the optimal strategy is not uncoded transmission, since coded
transmission outperforms uncoded transmission at small $\langle P_e \rangle$.



V. Discussions and Conclusions 

In this work, we derived lower bounds on the combined 
transmission and decoding energy for iterative decoding with 
unconstrained rates. It is important to note that these are lower 
bounds, and the actual energy consumption would only be 
higher. An interesting feature of our bounds is that the
optimizing rate for green codes is bounded away from zero,
and, in fact, converges to 1 as the error probability converges
to zero. This is qualitatively different from a pure black-box
modeling of the decoding process, where energy consumption
is independent of the desired error probability and the rate. In
that case, as observed in [7], the optimal rate is a constant that
can be greater than 1.

Fig. 5. Optimal value of rate vs. error probability: as $\langle P_e \rangle$ converges to 0,
the optimizing rate converges extremely slowly to 1.

For an AWGN channel, the value 1 for the optimal rate is a
result of a bit-wise representation of the information at the
decoder. If, however, the message nodes represent the information
in base $M$, then the optimizing rate would converge to
$\log_2(M)$.

For the BSC arising from performing hard-decision on 
BPSK symbols transmitted over an AWGN channel, the opti- 
mal rate still converges to 1. The rate is upper bounded by 1 
because the channel has binary input alphabet, and thus this 
case might seem somewhat uninteresting. However, uncoded 
transmission over BSC also corresponds to rate 1, which might 
falsely suggest that uncoded transmission is asymptotically 
optimal for minimizing the total energy. We observe that 
despite the optimal rate approaching 1, coded transmission 
attains the same error probability with much smaller total 
energy than uncoded transmission. 
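For reference, the uncoded side of this comparison is easy to compute. The sketch below assumes the standard hard-decision BPSK model, where the bit-error (BSC crossover) probability at normalized transmit power $P_T/\sigma_P^2$ is $Q\left(\sqrt{P_T/\sigma_P^2}\right)$, and finds by bisection the power that uncoded transmission needs for a target error probability. It counts transmit energy only, and the function names are illustrative.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def uncoded_error_probability(snr):
    """Bit-error probability of hard-decision BPSK at SNR = P_T / sigma_P^2."""
    return q_function(math.sqrt(snr))

def uncoded_snr_for(pe, lo=0.0, hi=1e4, steps=200):
    """Bisection for the SNR at which uncoded BPSK reaches error probability pe."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if uncoded_error_probability(mid) > pe:
            lo = mid          # error still too high: need more power
        else:
            hi = mid
    return hi

for pe in (1e-3, 1e-6, 1e-12):
    print(pe, uncoded_snr_for(pe))
```

Since uncoded transmission sends one bit per channel use, the returned value is also its normalized transmit energy per bit, which can be set against the coded bounds at the same error probability.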

We note that the total energy per-bit required to commu- 
nicate at arbitrarily low error probability increases to infinity 
for the message passing decoder. This is in contrast to the 
classical information-theoretic result for transmit power, which 
shows that the transmit power is bounded even as $\langle P_e \rangle \to 0$.
Based on results in [10], the total energy per bit increases 
to infinity for most known codes and decoding algorithms. 
It would be interesting to extend this result to all possible 
codes and decoding algorithms. An approach based on laws 
of physics is suggested in [10] for the fixed rate problem. The 
approach might yield results here as well. 

Appendix I 

Bounds in [7] for non-zero error probability 

Observe that the results in [7] are for $\langle P_e \rangle \to 0$ and
infinitely many information bits. Parallel to our analysis for
message passing decoding, in this appendix, we build on
the analysis in [7] to derive bounds on the minimum energy
required for communicating with a non-zero error probability
$\langle P_e \rangle$ and finitely many information bits.



Assume $k$ bits are to be transmitted across the channel,
with desired error probability $\langle P_e \rangle$. In [7], the authors
maximize the information bits communicated under a total
energy constraint. Turning around the problem in [7], we can
instead minimize the total energy consumed given the number
of bits transmitted. Now we can add an error probability
constraint to the bits transmitted. Assume that a block code is
used to communicate across the channel. The corresponding
error exponent is bounded by the sphere-packing bound [16].
Assuming optimistically that the code actually achieves the
sphere-packing bound in the exponent,

$$\langle P_e \rangle \le P_{e,block} \approx e^{-m E_{sp}(P_T, R)},$$

where $E_{sp}(P_T, R)$ is the sphere-packing bound at rate $R$ and
transmit power $P_T$. The objective, therefore, is

$$\min_{P_T, m} \; m \times (P_T + \epsilon)$$

subject to $m \times E_{sp}(P_T, \ldots)$ …